Resilience computation for queries with inequalities in databases
LIN Jie, QIN Biao, QIN Xiongpai
Journal of Computer Applications    2018, 38 (7): 1893-1897.   DOI: 10.11772/j.issn.1001-9081.2018010078
Focusing on the causality problem of conjunctive queries with inequalities in databases, resilience computation was introduced and implemented. To reduce the computational time complexity for path conjunctive queries with inequalities, a Dynamic Programming for Resilience (DPResi) algorithm was proposed. Firstly, according to the properties of path conjunctive queries with inequalities and the max-flow min-cut theorem, a Min-Cut algorithm with polynomial time complexity was implemented. Then, by compiling the lineage expression of a Boolean conjunctive query with inequalities into a lineage graph, the resilience problem was transformed into computing the shortest distance in the lineage graph. Combining the inclusion property with the optimal substructure of the lineage graph, the DPResi algorithm with linear time complexity was implemented by applying the idea of dynamic programming. Extensive experiments were carried out on the TPC-H datasets. The experimental results show that the DPResi algorithm greatly improves the efficiency of resilience computation and scales better than the Min-Cut algorithm.
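The abstract does not spell out the lineage-graph construction, so the sketch below only illustrates the generic step the reduction ends in: a shortest-distance computation over a weighted DAG, relaxed in topological order. The edge encoding and all identifiers are assumptions for illustration, not the paper's actual construction.

    from collections import defaultdict

    def dag_shortest_distance(edges, source, sink):
        """Shortest source-to-sink distance in a weighted DAG, relaxed in
        topological order -- the dynamic-programming step that a
        DPResi-style resilience computation reduces to (assumed encoding).
        edges: iterable of (u, v, w) triples with w >= 0."""
        graph = defaultdict(list)
        indegree = defaultdict(int)
        nodes = set()
        for u, v, w in edges:
            graph[u].append((v, w))
            indegree[v] += 1
            nodes.update((u, v))

        # Kahn's algorithm yields a topological order of the DAG.
        order, frontier = [], [n for n in nodes if indegree[n] == 0]
        while frontier:
            u = frontier.pop()
            order.append(u)
            for v, _ in graph[u]:
                indegree[v] -= 1
                if indegree[v] == 0:
                    frontier.append(v)

        distance = {n: float("inf") for n in nodes}
        distance[source] = 0
        for u in order:                    # every edge is relaxed exactly once
            for v, w in graph[u]:
                if distance[u] + w < distance[v]:
                    distance[v] = distance[u] + w
        return distance[sink]

Because each edge is relaxed exactly once after the topological sort, the pass runs in time linear in the size of the graph, which is consistent with the linear complexity the abstract claims for DPResi.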
Application case of big data analysis: robustness of a trading model
QIN Xiongpai, CHEN Yueguo, WANG Bangguo
Journal of Computer Applications    2017, 37 (3): 660-667.   DOI: 10.11772/j.issn.1001-9081.2017.03.660
The robustness of a trading model means that its profitability curve does not fluctuate dramatically. To improve the robustness of an algorithmic trading model based on Support Vector Regression (SVR), several strategies for deriving a unified trading model and a portfolio diversification method were proposed. Firstly, the SVR-based algorithmic trading model was introduced. Then, a number of derived indicators were constructed from commonly used technical indicators for short-term forecasting of stock prices. These indicators characterize typical patterns of recent price movements, overbought/oversold market conditions, and market divergence. The indicators were normalized and used to train the trading model, so that the model generalizes to different stocks. Finally, a portfolio diversification method was designed. In a portfolio, strong correlation between stocks can lead to large investment losses, because the prices of strongly correlated stocks move in the same direction; if the trading model predicts the price trend incorrectly, stop losses are triggered and the correlated stocks amplify one another's losses. Therefore, stocks were clustered into categories according to their similarity, and a diversified portfolio was formed by selecting stocks from different clusters, where the similarity of two stocks was defined as the similarity of the trading model's recent profit curves on them (see the sketch below). Experiments were carried out on ten years of data for 900 stocks. The experimental results show that the trading model obtains an excess return over time deposits, with an annualized return of 8.06%. The maximum drawdown of the trading model was reduced from 13.23% to 5.32%, and the Sharpe ratio increased from 81.23% to 88.79%. The volatility of the trading model's profit curve decreased, which means that its robustness was improved.
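As a rough illustration of the diversification step, the sketch below clusters stocks by the similarity of their recent profit curves and takes one representative per cluster. The concrete choices here (correlation distance, average-linkage hierarchical clustering, and all identifiers) are plausible assumptions, not the paper's actual procedure.

    import numpy as np
    from scipy.cluster.hierarchy import fcluster, linkage
    from scipy.spatial.distance import pdist

    def diversified_portfolio(profit_curves, n_clusters):
        """Cluster stocks by similarity of their recent profit curves and
        pick one stock from each cluster, so the selected stocks' profits
        do not move in lockstep (assumed clustering choices)."""
        tickers = list(profit_curves)
        curves = np.array([profit_curves[t] for t in tickers])
        # Correlation distance is small when two profit curves move together.
        condensed = pdist(curves, metric="correlation")
        labels = fcluster(linkage(condensed, method="average"),
                          t=n_clusters, criterion="maxclust")
        portfolio, seen = [], set()
        for ticker, label in zip(tickers, labels):
            if label not in seen:          # first stock seen in each cluster
                seen.add(label)
                portfolio.append(ticker)
        return portfolio

Picking one stock per cluster caps the portfolio's exposure to any single co-moving group, which is the mechanism the abstract credits for the reduced drawdown.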
Entity relationship search over extended knowledge graph
WANG Qiuyue, QIN Xiongpai, CAO Wei, QIN Biao
Journal of Computer Applications    2016, 36 (4): 985-991.   DOI: 10.11772/j.issn.1001-9081.2016.04.0985
Entity search and question answering over text corpora have difficulty joining cues from multiple documents to handle relationship-centric search tasks. Structured querying over a knowledge base can address this problem, but it suffers from poor recall because of the heterogeneity and incompleteness of the knowledge base. To address these problems, the knowledge graph was extended with information from textual corpora, and a corresponding triple pattern with textual phrases was designed for uniformly querying the knowledge graph and the textual corpora. On this basis, a model for automatic query relaxation and for scoring query answers (tuples of entities) was proposed, and an efficient top-k query processing strategy was put forward. Comparison experiments were conducted against two classical methods on three different benchmarks covering entity search, entity-relationship search and complex entity-relationship queries, using a combination of the Yago knowledge graph and the entity-annotated ClueWeb09 corpus. The experimental results show that the entity-relationship search system with query relaxation over the extended knowledge base outperforms the comparison systems by a large margin: the Mean Average Precision (MAP) is improved by more than 27%, 37% and 64% respectively on the three benchmarks.
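The abstract does not give the scoring model, so the toy matcher below only illustrates the general shape of the approach: one triple pattern is evaluated against exact knowledge-graph facts and, as a relaxed fallback, against textual-phrase facts whose scores are discounted, keeping the top-k answers. The (s, r, o, weight) fact encoding, the penalty constant and all identifiers are assumptions.

    import heapq

    def topk_relaxed_answers(pattern, kg_facts, text_facts, k=10, penalty=0.5):
        """Match one extended triple pattern over knowledge-graph facts and,
        as a relaxed fallback, over textual-phrase facts with discounted
        scores; keep the k best-scoring variable bindings (toy scoring).
        pattern: (s, r, o) strings, where '?'-prefixed strings are variables.
        kg_facts / text_facts: iterables of (s, r, o, weight) tuples."""
        def match(fact, discount):
            s, r, o, w = fact
            binding = {}
            for f_val, p_val in zip((s, r, o), pattern):
                if p_val.startswith("?"):
                    if binding.get(p_val, f_val) != f_val:
                        return None        # variable bound inconsistently
                    binding[p_val] = f_val
                elif p_val != f_val:
                    return None            # constant does not match the fact
            return binding, w * discount

        candidates = []
        for fact in kg_facts:
            hit = match(fact, 1.0)         # exact knowledge-graph match
            if hit:
                candidates.append(hit)
        for fact in text_facts:
            hit = match(fact, penalty)     # relaxed textual match, discounted
            if hit:
                candidates.append(hit)
        return heapq.nlargest(k, candidates, key=lambda hit: hit[1])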
Big data benchmarks: state-of-the-art and trends
ZHOU Xiaoyun, QIN Xiongpai, WANG Qiuyue
Journal of Computer Applications    2015, 35 (4): 1137-1142.   DOI: 10.11772/j.issn.1001-9081.2015.04.1137
A big data benchmark is eagerly needed by customers, industry and academia to evaluate big data systems, improve current techniques and develop new ones. A number of prominent works from the last several years were reviewed: their characteristics were introduced and their shortcomings analyzed. Based on that, some suggestions on building a new big data benchmark were provided, including: 1) component benchmarks and end-to-end benchmarks should be used in combination, to test individual tools inside the system as well as the system as a whole, with component benchmarks serving as ingredients of the whole big data benchmark suite; 2) besides SQL queries, workloads should be enriched with complex analytics to cover different application requirements; 3) in addition to performance metrics (response time and throughput), other metrics should also be considered, including scalability, fault tolerance, energy saving and security.